an audio pickup device for generating audio signals representative of sound from an
audio source; and

a multimodal integration architecture system for processing said image signals and said
audio signals to determine a direction of the audio source relative to a reference point.

2. (original) The video conferencing system of claim 1, wherein said multimodal integration
architecture system further comprises:

an audio source localization system;

a computer vision person detection system; and

a multimodal speaker detection system.

3. (original) The video conferencing system of claim 2, further comprising an integrated
housing for an integrated video conferencing system incorporating the image pickup device, the
audio pickup device, and the multimodal integration architecture system.

4. (original) The video conferencing system of claim 3, wherein the integrated housing is
sized for being portable.

5. (original) The video conferencing system of claim 2, further comprising an electronic pan
tilt zoom system for electronically manipulating the image signals to effectively provide at least
one of variable pan, tilt, and zoom functions.

6. (original) The video conferencing system of claim 5, wherein the image pickup device is a
stationary camera.

7. (original) The video conferencing system of claim 5, wherein the multimodal integrated
architecture system provides control signals to the electronic pan tilt zoom system.

8. (original) The video conferencing system of claim 7, wherein the audio source moves
relative to the reference point, the audio source localization system detects the movement of the
audio source, and, in response to the movement, the audio source localization system causes a
change in the field of view of the image pickup device.


9. (original) The video conferencing system of claim 5, wherein the audio pickup device is
comprised of an array of two microphones.



10. (currently amended) A method comprising the steps of: 

generating, at a stationary image pickup device, remaining motionless during
operation, image signals representative of an image;

generating, at an audio pickup device, audio signals representative of sound from an
audio source;

processing the image signals and the audio signals to determine a direction of the audio
source relative to a reference point;

manipulating the image signals to produce refined image signals; and

outputting said refined image signals.



11. (original) The method of claim 10, further comprising the steps of:

applying said audio signals to an audio source localization system;

applying said image signals to a computer vision person detection system;

processing said audio signals and said image signals with a multimodal speaker detection
system;

generating control signals based on the determined direction of the audio source;

applying the control signals to an electronic pan tilt zoom system to mimic the effect of at
least one function of a movable camera, said function selected from the group consisting of
panning, tilting, and zooming said movable camera; and

providing an output from said electronic pan tilt zoom system.



12. (original) The method of claim 10, further comprising electronically varying a field of
view of the image pickup device in response to the control signals.



13. (original) The method of claim 10, wherein processing the audio signals includes
determining an audio based direction of the audio source based on the audio signals.


14. (original) The method of claim 12, wherein the audio source moves relative to a
reference point, and wherein processing the audio signals further includes:

detecting the movement of the audio source; and

causing electronically, in response to the movement, an increase in the field of view of
the image pickup device.

15. (original) The method of claim 12, further comprising the step of supplying control
signals, based on the audio based direction, for electronically panning, tilting, or zooming said
image pickup device.

16. (currently amended) A video conferencing system comprising:

two microphones for generating audio signals representative of sound from a speaker;

a stationary video camera, remaining motionless during operation, for generating video
signals representative of a video image;

an electronic pan tilt zoom system for manipulating video images to produce the visual
effects of panning, tilting, and/or zooming;

a processor for processing the video signals and the audio signals to determine a direction
of a speaker relative to a reference point and supplying control signals to the electronic pan tilt
zoom system for producing images that include the speaker in the field of view of the camera, the
control signals being generated based on the determined direction of the speaker; and

a transmitter for transmitting audio and video signals for video conferencing.
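
The electronic pan tilt zoom element recited in claim 16 can be pictured as selecting a crop window from the full frame of the stationary camera, so that the output appears panned, tilted, or zoomed although the camera never moves. The following is a minimal sketch under stated assumptions, not the claimed implementation: the function name `crop_window`, the pinhole-camera mapping from angle to pixels, and the default field-of-view values are all hypothetical and do not come from the claims.

```python
import math


def crop_window(frame_w, frame_h, pan_deg, tilt_deg, zoom,
                hfov_deg=60.0, vfov_deg=40.0):
    """Return (left, top, width, height) of a crop emulating pan/tilt/zoom.

    Hypothetical helper: pan_deg and tilt_deg stand in for the determined
    direction of the speaker; hfov_deg and vfov_deg are assumed camera
    fields of view, not values taken from the claims.
    """
    if zoom < 1.0:
        raise ValueError("zoom must be >= 1.0 (cannot show more than the frame)")

    # Zooming in by a factor of `zoom` shrinks the window by that factor.
    win_w = round(frame_w / zoom)
    win_h = round(frame_h / zoom)

    # Assumed pinhole model: an angle maps to a pixel offset through tan().
    half_w, half_h = frame_w / 2, frame_h / 2
    cx = half_w + half_w * (math.tan(math.radians(pan_deg))
                            / math.tan(math.radians(hfov_deg / 2)))
    # Image rows grow downward, so a positive (upward) tilt lowers the row.
    cy = half_h - half_h * (math.tan(math.radians(tilt_deg))
                            / math.tan(math.radians(vfov_deg / 2)))

    # Clamp so the window never leaves the captured frame.
    left = min(max(round(cx - win_w / 2), 0), frame_w - win_w)
    top = min(max(round(cy - win_h / 2), 0), frame_h - win_h)
    return left, top, win_w, win_h
```

With a zoom of 1 the window is the whole frame and pan or tilt has no visible effect, which reflects the fact that a stationary camera can only "pan" within what it already captures.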